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Abstract 

Background: Several dynamic models of a gene regulatory network of the light-induced floral transition process in 
Arabidopsis have been developed to capture the behavior of gene transcription and infer predictions based on 
experimental observations. It has been proven that the models can make accurate and novel predictions, which 
generate testable hypotheses. 

Two major issues were addressed in this study. First, construction of dynamic models for gene regulatory networks 
requires the use of mathematic modeling that comprises equations of a large number of parameters. Second, the 
binding mechanism of the transcription factor with DNA is another factor that requires detailed modeling. The first 
issue was tackled by adopting an optimization algorithm, and the second was addressed by comparing the 
performance of three alternative modeling approaches, namely the S-system, the Michaelis-Menten model and the 
Mass-action model. The efficiencies of parameter estimation and modeling performance were calculated based on 
least square error {0(p)), mean relative error (MRE) and Akaike Information Criterion (AIC). 

Results: We compared three models to describe gene regulation of the flowering transition process in Arabidopsis. 
The Mass-action model is the simplest and has the least parameters. It is therefore less computation-intensive with 
the smallest AIC value. The disadvantage, however, is that it assumes the system is simply a second order reaction 
which is not the case in our study. The Michaelis-Menten model also assumes the system is homogeneous and 
ignores the intracellular protein transport process. The S-system model has the best performance and it does 
describe the diffusion effects. A disadvantage of the S-system is that it involves the most parameters. The largest 
AIC value also implies an over-fitting may occur in parameter estimation. 

Conclusions: Three dynamic models were adopted to describe the dynamics of the gene regulatory network of 
the flowering transition process in Arabidopsis. Based on MRE, the least square error and global sensitivity analysis, 
the S-system has the best performance. However, the fact that it has the highest AIC suggests an over-fitting may 
occur in parameter estimation. The result of this study may need to be applied carefully when modeling complex 
gene regulatory networks. 

Keywords: Dynamic model, Michaelis-Menten model, Mass-action model, S-system, Arabidopsis, Floral transition 
process 



Background 

Arabidopsis thaliana is a plant in the mustard family 
that has been frequently chosen as the organism model 
in research on plant science. It possesses small size, dip- 
loid genetics, small genome and relatively short gener- 
ation time. The life cycle of Arabidopsis from vegetative 
to reproductive growth is an important developmental 
step that is under tight genetic control. In the meanwhile 
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the floral transition state has shown to be integrated by 
a complex gene regulatory network. 

For Arabidopsis, floral organ specification has been suc- 
cessfully linked to spatial gene expression patterns according 
to floral transition and floral development. This model has 
five pathways that can explain various external (photoperiod, 
vernalization, ambient temperature) and internal (autono- 
mous, age, gibberellins) conditions to regulate the floral 
transition through an elaborated genetic network [1-5]. 

Recently, gene expression data sets have become avail- 
able for the genes involved in the regulation of floral 
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transition and flower development in Arabidopsis. In [6], 
time series of gene expression were presented for each 
class of genes in the floral transition group. For most 
floral transitions, the majority of which are members of 
Arabidopsis, namely APETALA (API) and LEAFY (LFY), 
are transcription factors where the way in which they 
are activated was known from experiments [7]. Further- 
more, it has been shown that in regulating the flowering 
time in Arabidopsis, these two information sources open 
the door for mathematical model development. 

To inference gene regulatory networks from time course 
data has been one of the main challenges in systems biol- 
ogy. In recent years, technological advances have driven the 
development of systems biology in experimental methods 
that generate in vivo time course data to characterize regu- 
latory interactions. In the last years there has been a sig- 
nificant increase of publications in the area of model 
construction. Some examples include: cell fate determin- 
ation in Arabidopsis flowers [8], model study of role pro- 
teins (CLV1, CLV2, CLV3 and WUS) in shoot apical 
meristem of Arabidopsis [9], integration of developmental 
and hormonal pathways in the Arabidopsis flower [10], 
and gene regulatory network models for plant develop- 
ment [11]. 

However, a major challenge with such models is that 
the detailed transcript binding process in a microscopic 
picture is usually unclear; therefore these models may be 
deviated from the reality. In addition, a dynamic model re- 
quires extensive mathematical formula and large amount of 
experimental data that are not available. Alternatively, a 
large-scale gene regulation model can be constructed based 
on stoichiometry without a large number of fitted parame- 
ters. Although these models can be used to predict the 
regulation behaviour using flux analysis, they failed to 
capture the transient behaviors of genes. For instance, 
Mahadevan [12] proposed the dynamic flux balance ana- 
lysis for situations where there is knowledge available; Yugi 
[13] proposed a method that aims to simplify the number 
of kinetic parameters in building a dynamic model. 

Many studies on dynamic simulation of gene regulation 
systems have been reported in the literature. Spieth [14] 
used linear weight matrices, S-system and H-systems model, 
and different optimization algorithms to model a non- 
linear dynamic system. Rafael et al. [15] compared Michaelis- 
Menten model, power-law and generalized mass action to 
represent an E. coil central carbon metabolic network. 

In this study, we modeled the regulatory interactions in 
the flowering of Arabidopsis with a series of kinetic func- 
tions. The first case considers the conditions that mRNA 
is produced immediately after transcript factor binding. 
This process is formulated as a mass action model. The 
second case assumes that a complex state is formed be- 
tween the transcription factor and its target gene. The 
production of mRNA is delayed due to the stability of the 



complex state. This process is formulated as a Michaelis- 
Menten model. The third case assumes that the binding 
process of the transcript factor is limited by 1-D and 3-D 
diffusions, and the production of mRNA is dominated 
by a diffusion-reaction process. Accordingly S-system was 
adopted in this study to model such a mechanism. 

Results 

Comparison of the models 

Table 1 presents the reaction mechanisms depicted in 
the flowering time regulatory network of Arabidopsis. 
This gene regulatory network describes the flowering time 
(Photoperiodic) in Arabidopsis thaliana. The core of this 
regulatory network includes: 

1 The photoperiod activates the CO gene. 

2 After CO activates the expression of FLOWERING 
LOCUST (FT) probably by binding to the FT 
regulatory regions and the bZIP transcription factor 
FD, the FT/FD complex activates the expression of 
SOCL 

3 SOC1 and AGL24 form a positive feedback loop and 
up-regulate LFY. 

4 API and LFY are ultimately resolved in the up- 
regulation of the floral meristem identity genes. 

In this study, we compared three dynamic models to 
reconstruct the behaviour of the flowering time regula- 
tory network of Arabidopsis. The governing differential 
equations are listed in the following: 

Table 1 A dynamic model for the concentration of gene 
complexes of the flowering time regulatory network of 
Arabidopsis according to the reaction scheme depicted 
in Figure 4 





Variable 


Symbol 


Description 


Reference 


Independent 












X8 


FD 


Basic-leucine zipper 
(bZIP) transcription 
factor family protein 


[1] 




X9 


PHYB 


Phytochrome B 


[2] 


Dependent 












X1 


CO 


Zinc finger protein 
CONSTANS 


[3] 




X2 


FT 


Protein FLOWERING 
LOCUS T 


[4] 




X4 


S0C1 


MADS-box protein 
transcription factor S0C1 


[5] 




X5 


AP1 


Floral homeotic protein 
APETALA 1 


[6] 




X6 


AGL24 


MADS-box protein 
AGL24 


[7] 




X7 


LFY 


Protein LEAFY 


[8] 
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S-system 
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X4 
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X6 
XI 
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XS and X9 are constants. 



Micharlis-Menten model 



XI 



X2 



X3 



X4 - 



X5 



X6 



XI 



-dki[Xl] 
-dk 2 [X2] 



Vm v [X9] 

Km + [X9] 
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Vm 3 -lX2}[X8} 

: /^ 3 + [X2][X8]-^ 3[X3] 

/ Vm^ a \X3] Vm^ b \X6] 
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Vm 5 -[X3] 

' Km 5 + [X3] 
Vm 6 \X^] 

: Km 6 + \XA] 
Vrn 7 .[X6] 

' Km 7 + [X6] 



-d!u[X4] 



-dk 5 [X5] 
-dk 6 [X6] 
-dk 7 [X7] 



XS and X9 are constants. 
Mass-action model 

XI = kn[X9]-dki[Xl] 

X2 = kr 2 [Xl]-dk 2 [X2] 

X3 = kr 3 [X2][X%]-dk 3 [X3] 

X4 = (kr[X3), a MX6U)-dh[X^\ 

X5 = kr 5 [X3]-dk 5 [X5] 

X6 = kr 6 [X^}-dk 6 [X6] 

XI = kr 7 [X6]-dk 7 [X7] 

X8 and X9 are constants. 

Table 2 summarizes the total number of parameters 
used in each model. Due to the complex nature of the 
S-system model, 31 parameters were used to describe 
the floral transition pathway, whereas the Micharlis- 
Menten model and the Mass action model used 23 and 



Table 2 The total number of parameters in each of the 
three models used in this study 



Total number of parameters 



Model 

S-system 
31 



Mass action 
15 



Michaelis-Menten 
23 



15 parameters, respectively. Among them the S-system 
used the most parameters because the reaction rate 
was described by a non-linear function for the react- 
ant concentration. 

Parameter estimation has been considered as a reverse en- 
gineering problem, which may be performed based on local 
optimization methods and global optimization methods. 
In this study, three different optimization methods were 
employed including local HJ (local optimization), EP and 
PSO (global optimization). The 0(p) and MRE were used 
to measure the quality of the fit for the estimated parame- 
ters. The values of Ofpjcalculated for the three models 
and the three optimization methods with four experimen- 
tal data sets are shown in Figure 1. The result suggests 
that the PSO method was most suitable for our dynamic 
models. 

The values of the MRE calculated for the three models 
and the four experimental data sets are presented in 
Table 3. The result suggests that the S-system model 
could achieve the best performance compared with the 
other two models, as the S-system has the smallest mean 
relative error (shown in Figure 2). 

The AIC calculated for the three models, namely the S- 
system (col, 53.0331; fer, 52.6223; co, 46.2319; ft, 49.6211), 
Michaelis-Menten model (col, 30.1816; lev, 38.1605; 
co, 24.6298; ft, 34.1275) and Mass-action model (col, 
26.1364; fer, 26.5598; co, 10.8465; /£, 22.8427), are pre- 
sented in Table 4. The result suggests that among the 
three, the Mass-action model is the simplest and has the 
least parameters. It is therefore less complex with the 
smallest AIC value. 

These results suggest that the S-system model repre- 
sents an option to simulate the dynamics of our gene 
regulatory network. It is understood that a limitation of 
the S-system model is that the parameters may not be 
identifiable when the concentrations and reaction rates 
are very small. However, considering that the cell interior 
is homeostatic, this condition is unlikely to occur during 
the flowering time of Arabidopsis. Another possible diffi- 
culty with the S-system is the low sampling intervals of the 
genes required for parameter identification, which is also a 
challenge for all other kinetic models. For parameter esti- 
mation we assumed that all the 8 genes are measurable. 

The estimated parameter values are listed in Additional 
file 1. Figure 3 shows the simulated time course data for 
the following genes: CONSTANS (CO), FLOWERING 
LOCUS T (FT), protein FD (FD), SUPPRESSOR OF 
CONSTANS OVEREXPRESSION 1 (SOC1), APETALA1 
(API), AGL24 and LEAFY (LEY). It is noted that the 
discrepancies between the S-system predicted values 
and the mRNA expression levels were relatively small for 
all of the modeled genes, suggesting that it may success- 
fully replace the other two models to simulate time course 
data. 
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SS_C0L_P3O MM_00L_FSO MS_C0l_PS0 SS_C0L_HJ MlJ.COiJU MS_COi._HJ SS_COL_EP MMCtX^EP USCOt^EP 




83_LER_PSO M*UJERJ>SO MSJ.EfU>30 S3_LEJUiJ MM_LJER_HJ MS_IERJ« 83_L£«JP MM_LER_EP MSJJEHJ3P 




ss_coj>so mjnjso mjXJIO ss^coju mjzojv u3_cojij 



SS_COJEP WLCOJP 




Figure 1 An analysis of the O(p) calculated for the three models and three optimization methods in four experimental data sets: (A) O(p) 
calculated for the col experimental data; (B) O(p) calculated for the ler experimental data; (C) O(p) calculated for the co experimental data; and 
(D) O(p) calculated for the ft experimental data. 



Performance of the three models 

In this study, we compared three alternative models: 
the S-system, Michaelis-Menten model, and Mass-action 
model, for the flowering time regulatory network of 
Arabidopsis. Both the Michaelis-Menten model and Mass- 
action model ignore the diffusion effect in the reaction. The 
Mass -action model assumes that the TF initiates mRNA 



transcription immediately, whereas the Michaelis-Menten 
model describes TF binds with DNA first and with the 
active mRNA later. The S -system models this process by 
adopting the ID and 3D diffusion-reaction mechanisms. 
The ID diffusion-reaction mechanism assumes the TF 
binds with the target site then activates the mRNA. The 3D 
diffusion-reaction mechanism describes that the TF binds 
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Table 3 The means and standard deviations of MRE calculated for the S-system, Michaelis-Menten model and Mass- 
action model 

Model col ler co ft 

S-System 0.0325 ± 0.0131 0.0380 ± 0.0193 0.0490 ± 0.0297 0.0312 ± 0.0109 

Michaelis-Menten model 0.1 294 ± 0.1 801 0.1 542 ± 0.1 595 0.2746 ± 0.3954 0.0775 ± 0.051 7 

Mass action model 0.1 295 ± 0.1 570 0.1 738 ± 0.1 654 0.2266 ± 0.2900 0.0899 ± 0.0628 



with the DNA first then diffuses along the DNA to search 
for the target site, activating the mRNA transcription 
process during the final stage. 

Shown in Figure 1 and Figure 2 are the MRE and O(p) 
calculated for the seven floral transition genes. The re- 
sults indicate that the S-system, when compared with 
the real experimental data, achieved the best prediction. 

Sensitivity analysis of the models 

Sensitivity analysis can be applied to estimate the effect 
of parameter changes, while MRE gives an estimate of a 
models rate of change based on local sensitivity analysis. In 
this section, we report the results of the time dependent 
sensitivity analysis [Eq. 8] for a time period of 100 seconds. 

As shown in Figure 4, it is obvious the sensitivity measure 
is positive for every gene. This implies that the mRNA 
concentration increases due to perturbation. The reason is 
that all of the interactions were activations. The results of 
the S-system show that fluctuations are limited to the first 
10 seconds only, which is relative short compared to 50 
seconds for the Michaelis-Menten model or mass action 
model (see Additional file 2 and Additional file 3). This in 
turn suggests that the response is a transient effect at most. 
A sensitivity value near zero means that gene activity is not 
sensitive to parameter perturbation at all. For a given gene, 
the response curves for the rate constants reflect the effect 
of perturbation on the transcription rate. For the kinetic 
order response, the sensitivity results indicate the effects on 
the strength of the activation or suppression. 

Although an FT/FD complex formation measurement 
was not available, as long as we assumed a steady state 



approximation, i.e., [FD/FT] = k[FD] [FT], the S-system 
was still capable of giving a reasonable fitting for the 
complex's regulated genes: SOC1 and API. 

In this study, we focused on the experiments for fitting 
the parameters. With the sensitivity analysis we identified 
the most sensitive parameters and sampling time intervals; 
this may provide some directions for future experimental 
design for model refinement. 

Discussions 

Production of mRNA is dominated by the binding process 
between TF and its target gene. This binding process can 
be described by a diffusion- reaction mechanism. Although 
the Michaelis-Menten model has been widely used in gene 
regulation models, the results based on the MRE values 
show that the S-system is a better dynamic model for de- 
scribing the flowering time of Arabidopsis. But the AIC 
values for the S-system (shown in Table 4) were larger 
than those of the Mass-action model, which implies an 
over-fitting may occur in parameter estimation. 

The deviation is significant between the simulated data 
and experimental data for genes AGL24 and SOC1 in all 
the three models. These two genes form a feedback loop 
in the regulatory network; therefore such interactions 
degrade the performance of the three models. For a more 
complex regulation model, additional factors should be 
considered in the transcription mechanism. The values ob- 
tained from the sensitivity analysis were all positive, which 
indicate that the mRNA production rate is proportional to 
the collision frequency as well as the binding force 
between the TF and its target gene. 




ler 



co 



col 



SS MM MS SS1 MM.1 MS 1 SS 2 MM 2 MS.2 

Figure 2 The MRE calculated for the three models with four experimental data sets. 
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Table 4 The Akaike Information Criterion (AIC) calculated 
for the S-system, Michaelis-Menten model and Mass ac- 
tion model 



Model 


col 


ler 


CO 


ft 


S-system 


53.0331 


52.6223 


46.2319 


49.6211 


Michaelis-Menten model 


30.1816 


38.1605 


24.6298 


34.175 


Mass action model 


26.1364 


26.5598 


10.8465 


22.8427 



One may suggest that the performance effect is more due 
to the modeling technique rather than the equation system. 
This study only compared three possible models for one 
dynamic system. As the Mass-action model and Michaelis- 
Menten model techniques do not consider diffusion, they 
did not perform as well as the S-system. While the 



S-system seems to be most promising, a general conclusion 
that it is a better approach for modeling complex large- 
scale networks is yet to be established. 

We used three reaction mechanisms to describe the 
process that a transcription factor binds on the promoter 
to generate mRNA. The Mass-action model assumes that 
this is a simple second-order reaction; the Michaelis- 
Menten model assumes that there is an intermediate prod- 
uct; and the S-system considers the diffusion effects in one 
dimension and three dimensions. Among the three, the 
Mass-action model is the simplest and has the least param- 
eters. It is therefore less computation-intensive with the 
smallest AIC value (see Table 4). The disadvantage, how- 
ever, is that it assumes the system is homogeneous which 



8 




2 3 

Time (day) 




Time (day) 



Time (day) 



Time (day) 





Time (day) 



Figure 3 A comparison of the simulated regulation of the flowering time in Arabidopsis (0, 3, 5, and 7 day)CONSTANS (CO), FLOWERING 
LOCUS T (FT), Protein FD (FD), SUPPRESSOR OF CONSTANS OVEREXPRESSION 7 (SOC1), APETALA 1 (API), AGL24 and LEAFY (LFY) for the 
expression data of ler (black solid line) and other models (S-system — — — — , Mass-action model — ■ — , Michaelis-Menten 
model A ). 
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Figure 4 Time-dependent sensitivity analysis of the parameters in the S-system, where a and b are system function parameter vectors 
(alpha and beta) consisting of rate constants, and g and h are kinetic orders for genes CONSTANS (CO), FLOWERING LOCUS T (FT), 
Protein FD (FD), SUPPRESSOR OF CONSTANS OVEREXPRESSION 7 (SOC1), APETALA1 (API), AGL24 and LEAFY (LFY). 



is not the case in our study. The Michaelis-Menten model 
also assumes the system is homogeneous and ignores 
the intracellular protein transport process. The S -system 
model does describe the diffusion effects which is the 
main driving force for mass transport in the cell. A disad- 
vantage of the S- system is that it involves the most 
parameters. The largest AIC value of the S -system 
also implies an over-fitting may occur in parameter 
estimation. 

We also considered the use of the Hill equation. In 
general the Hill equation is employed to describe the 
phenomenon that binding of a ligand to a macromol- 
ecule influencing the other ligands binding on the same 
macromolecule, which is known as cooperative binding. 
The Hill coefficient is used to quantify this effect, where 
a value of 1 indicates a completely independent binding, 
a value greater than 1 indicates a positive cooperativity, 
and a value less than 1 indicates a negative cooperativity. 
That is to say, a plurality of the same or different 



transcription factors are bound in the promoter region 
of a gene, and the first transcription factor affects the 
subsequent transcription factors in the promoter re- 
gion. However, in our gene regulatory system, the num- 
ber of transcription factors for the regulating genes is 
mostly one, and all the transcription factors have only one 
binding site on the promoter region. Therefore the Hill 
equation was not applicable in our study. 

Conclusions 

One of the major problems of establishing large-scale 
dynamic models is the lack of experimental data. The 
parameters are usually unknown, so are the specific reac- 
tion rate laws. Moreover, for a large number of reactions, 
the parameters are only available in the literature whose 
values have to be obtained in in vitro conditions. 

In this study, we focused on the molecular mechanism 
of transcription to propose models for describing the 
gene regulatory interactions of the flowering transition 
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processes in Arabidopsis. The S -system has the best per- 
formance. Although we assumed that the best perform- 
ance had come from the consideration of diffusion effects, 
its highest AIC values indicated a possible over-fitting in 
parameter estimation. It is therefore necessary to carefully 
apply the S-system for modeling complex gene regulatory 
networks. The diffusion effects may as well be included 
in the parameters for the Mass action model and the 
Michaelis-Menten model. 



Methods 

The regulation of flowering time network 

The state transition to flowering in plants is precisely 
controlled by environmental conditions and endogenous 
developmental cues so that plants produce their progeny 
under favorable conditions. The response to multiple fac- 
tors suggests the existence of a complex network regulat- 
ing this state transition in plants. In this study, the biology 
of flowering time (Photoperiodic) in Arabidopsis thaliana 
showed that a complex gene regulatory network that con- 
trols this transition integrates the responses based on vari- 
ous external and internal conditions. Consequently, the 
regulation of the flowering time has been a major adaptive 
trait during plant evolution and domestication. A large 
number of genes have been characterized as flowering time 
regulators, and several recent reviews have provided de- 
tailed descriptions of flowering time pathways [2] . 

Arabidopsis is a facultative or quantitative long day 
plant that can flower, albeit much later in a short day. Key 
regulatory genes appear to be conserved in Arabidopsis. 
A short day plant suggests that common pathways are 
utilized [16,17]. The plant perceives photoperiod and is 
transduced to a downstream signaling system. The light- 
dependent pathway controls the flowering in response to 
seasonal changes. 

Figure 5 shows the gene regulatory pathway for the flow- 
ering transition pathway in Arabidopsis [1], which is medi- 
ated by CONSTANS (CO): 



1 CO codes for a zinc finger and CCT domain- 
containing transcription factor that accumulates 
under long day conditions in leaves as a result of the 
combination of the rhythmic expression of CO mRNA. 

2 CO activates the expression of FLOWERING 
LOCUST (FT) probably by binding to the FT 
regulatory regions [18,19]. The FT protein is a 
component of the mobile flowering signal that 
moves upon its expression in the vascular tissue of 
leaves to the shoot apex [20,21]. 

3 At the meristem, FT physically interacts with the 
bZIP transcription factor FD and the FT/FD 
complex and activates the expression of SOC1 [22]. 

4 SOC1 and AGL24 form a positive feedback loop, 
and the two factors might form a complex for the 
up-regulation of LFY. Thus, the regulators of the 
floral transition form a small network with multiple 
interactions among themselves, 

5 API and LFY are ultimately resolved in the 
up-regulation of the floral meristem identity genes. 

We used three different models for the flowering transi- 
tion pathway in Arabidopsis to reconstruct the experimen- 
tal data. 



Experimental data 

We obtained the experimental data from: 

(1) Plant Expression Database (http://www.plexdb.org/), 
ID: AT4; 

(2) NCBI GEO, ID: GSE576 and GSE577. 

Both contain the microarray data of the eight genes 
included in our study. Between the two, AT4 discusses 
the flower development of Arabidopsis thaliana; it was 
controlled by several signaling pathways which con- 
verge on a set of genes such as FT and SOC1 that function 
as pathway integrators. The photoperiod is regarded 



I Photoperiod 



SOCl AGL24 



CO FT FD FT/FD 



Apex 



Figure 5 Photoperiod pathway for the flowering transition process in Arabidopsis. 



API 



LFY 
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the most important factor in promoting floral transition: 
Arabidopsis thaliana will flower earlier under long day con- 
ditions than under short day conditions. It is therefore con- 
sidered a facultative long day plant. To monitor changes of 
gene expression during floral transition and early flower de- 
velopment, plants were grown under short day (9 hr light, 
15 hr dark) for 30 days and then shifted to long day (16 hr 
light, 8 hr dark) to induce flowering. The RNA was isolated 
from micro-dissected apical tissue harvested 0, 3, 5, and 7 
days after hybridized to the Affymetrix ATH1 microarray. 

We not only analyzed the reopens of Columbia {col) 
and Landsberg erecta (ler) but also the effect of mutants in 
the flowering time genes CONS TANS (co) and FT (ft). In 
this study, we used four experiments for parameter esti- 
mation: (1) the Columbia (col) is the most widely-used 
wild type of Arabidopsis thaliana; (2) The Landsberg 
erecta (ler) is currently the second most widely-used ac- 
cession of Arabidopsis thaliana; finally (3) CONSTANS 
(co) and (4) FT (ft) are mutants in the flowering time [6]. 

We used four different experimental data sets based 
on optimized parameter estimation to find the most ap- 
propriate model. 

Dynamic model 

Before introducing the transcription mechanism of the 
binding process, the following assumptions were made 
in advance: (a) Transcription is initiated when all the ac- 
tivation binding sites are occupied, and all the repression 
binding sites regarding the same gene are empty; and 
(b) The cell size remains constant during the time course 
of flowering state transition. 

S-system 

Most biological systems are nonlinear. Although the 
Michaelis-Menten model has been widely used to approach 
biological systems, one of the disadvantages lies in the fact 
that it is not an explicit mathematical form for all cases, es- 
pecially for complex processes such as diffusion-reaction 
interactions. The S-system consists of a set of mathemat- 
ical terms that is sufficient to capture most of the nonlin- 
ear phenomena in nature including diffusion-reaction 



interactions. The development of the S-system [23] has 
been shown to provide a good approximation for many 
cases, and there are efficient procedures for estimating the 
parameter values [24,25]. Protein-DNA interactions such 
as the binding between a transcription factor and its target 
gene have been studied for many years. Experimental ob- 
servations have promoted the proposition that the diffu- 
sion effects in 3-D and 1-D crawled along the DNA are 
critical in the binding process. Early studies have yielded 
an unexpected result that the binding rate for the Lac re- 
pressor protein to its binding site on DNA is approxi- 
mately 100-1000 times faster than the maximal 3-D 
diffusion rate in solution [26]. This phenomenon is called 
facilitated diffusion [27]. A picture of facilitated diffusion 
is schematically shown in Figure 6. 
The process can be described by the reaction 



TF ns +BS^mRNA 

where TF represents the transcription factor, TF ns repre- 
sents the non-specific absorbed transcription factor on 
the DNA, and BS represents the binding site. The first 
step in this reaction is absorption of the transcription 
factor on the DNA. The second step in this reaction is 
mRNA production after the absorbed transcription fac- 
tor has bound to its target gene. The mRNA production 
rate can be formulated as [28]: 

where \ is the average length of the transcription factor 
that moves along the DNA, L is the total length of the 
DNA, and x is the sum of the 3-D diffusion time and the 
retention time on the DNA for the transcription factor. 

The Langmuir isotherm is not suitable for describing 
protein adsorption, because the diverse conformations 
and multiple absorbed sites in the absorption process [29] . 
The Freundlich adsorption isotherm is more concordant 
with the experimental observations of proteins absorption. 
Proteins absorption is strongly dependent on the bulk 
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Figure 6 Protein searches the target on DNA. The kon and koff are adsorption and dissociation rate constants for protein and X is the average 
length that each protein moves along DNA. 
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concentration of proteins. In this case, the Freundlich ad- 
sorption isotherm of transcription factor on the DNA can 
be expressed as 



[TF m ] = K[TF] 



l/n 



(2) 



where K and n are constants at a particular temperature. 
From equations (1) and (2), the mRNA production rate 
can be determined as: 



Xk 
Lt 



\TF 



l/n 



(3) 



It can be transformed to the mathematic form of the 
S-system. For this reason, we adopted the S-system func- 
tion to describe the diffusion-reaction interactions in the 
mRNA transcription process. In the case of mRNA tran- 
scription, the S -system equation for the transcription rate 
is given by: 



-a l l[[TF,r>-/l i l[[TF,]> 



(4) 



where n is the number of transcription factors, TFj is the 
j-th transcription factor that regulates gene z, a t and 
denote the positive rate constants, and gy and hy are re- 
ferred to as the kinetic orders. If gy > 0, gene / will in- 
duce the expression of gene i. On the contrary, gene / 
will inhibit the expression of gene i if gy < 0. The variable 
hij has the opposite effects in controlling the gene ex- 
pression compared to gij. In the present study, the range 
of gij and falls between 0 and 2. 

Michaelis-Menten model 

The Michaelis-Menten model describes a catalysed reac- 
tion in which an intermediate complex forms after binding 
between enzyme and substrate. Thereafter, the complex 
(TF-BS) converts to the product and enzyme. In the case 
of mRNA transcription, the transcription process via such 
mechanism may be represented as: 

TF + BS++TF-BS 

TF-BS^mRNA 

Under the condition [TF]> > [BS], the production rate of 
mRNA for a gene with the diffusion effect ignored can be 
formulated as: 



Vn 



\TF] 



' K m + [TF] (5) 
where V max is the maximum production rate of mRNA 
and K m is the Michaelis constant. The delay effect on 
the mRNA production increases with the stability of the 
complex state. 



Mass-action model 

The chemical mechanism of Mass-action model states 
that the reaction rate is proportional to the product of 
the active mass of reactants. In the case of mRNA tran- 
scription, the reactants correspond to the transcription 
factor (TF) from the upstream gene and its specific bind- 
ing site (BS) on the downstream gene. The transcription 
process in the cell may be represented as: 

TF + BS^mRNA 

Because the total number of binding sites for a specific 
gene is fixed, the production rate of mRNA of the down- 
stream gene with diffusion ignored can be formulated as: 



v = k[TF] 



(6) 



where k is the rate constant and [TF] represents the con- 
centration of the transcription factor. The delay effect on 
the mRNA production is assumed to be zero. 

Parameter estimation 

The objective of parameter estimation is to adjust the par- 
ameter values of a model via an optimization procedure 
so that the predictions based on the model can closely ex- 
press the observation data. Parameter estimation can be 
performed through global methods and local methods 
[30]. However, one of the major challenges in modeling 
large-scale dynamic systems has been the existence of sev- 
eral local minima in the solution space. In this study, par- 
ameter estimation was performed by using the software 
tool "Complex Pathway Simulator" (COPASI Ver. 4.8) to 
fit the time series experimental data based on a dynamic 
model [31]. Evolutionary Programming (EP), Hooke & 
Jeeves (HJ) and Particle Swarm Optimization (PSO) were 
applied to search for an optimal solution, which may not 
converge to the minimum with different initial guesses. 
Among the three, EP [32,33] was inspired by biological 
evolution, PSO [34] was inspired by social behavior and 
the movement dynamics of insects, birds and fishes, and 
HJ [35] was derived based on a hill climbing technique. 

These algorithms possess key advantages in large inverse 
problems of quantitative mathematical models [36]. The 
goodness of fitting for each set of estimated parameters 
can be quantified by the least squared error O: 



0{p) = J2J2co i (X iJ -Y Q {p)f 



(7) 



i=\ ;=1 



where p is the tested parameter set, n and t are the num- 
ber of genes in the regulatory network and the number 
of samples in the time series data, respectively; (p) is 
the prediction time series data by the dynamic model for 
the parameter set p; and Xy is the experimental time 
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series data. The weighting factor is given by the mean 
square: o)i = i— — . 



Model ranking and selection 

In this study, we compared three models for the gene 
regulatory network of the flowering time regulation in 
Arabidopsis. We used the mean relative error (MRE) to 
quantify the response of each model subject to small 
perturbations [37]: 



MRE 



(8) 



where denotes the experimental time series data for 
the z'-th gene at time point j, y i} j is the simulation data 
for the z'-th gene given by the model at time point j, n is 
the total number of genes, and t is the number of sam- 
ples in the time series data. 

Parameter sensitivity analysis 

Dynamic models have been widely used to study metabolic 
networks and gene regulatory networks. These models 
are used to reconstruct experimental data and predict un- 
observed behaviors of a biological system. To address the 
many sources of uncertainty including error and noise in 
the experimental data, sensitivity analysis may be per- 
formed to identify the parameters in a model that have 
the strongest effect on the overall behavior. 

Sensitivity analyses have the primary goal of determin- 
ing how a given model responds to variations in a param- 
eter. Local sensitivity analysis focuses on a particular point 
in the parameter space by changing one parameter at a 
time to obtain a local response of the model. Global sensi- 
tivity analysis tries to capture the entire parameter space 
all at once, allowing multiple parameters to be explored 
simultaneously [38]. 

In this study, we used the SBML-SAT software tool for 
Multi-Parametric Sensitivity analysis (MPSA) [39]. An 
MPSA analysis of the time dependent normalized sensi- 
tivity response is defined as: 



dlnXj 



(9) 



where Xjis the mRNA concentration of the y-th gene, 
and pi is the z-th parameter in the dynamic model. 

Akaike information criterion 

We compared three models for flowering time regula- 
tion in Arabidopsis, We used several parameter estimation 
methods to estimate the parameters of the dynamic 
models. 

Parameter Estimation helps us quantify the regulatory 
abilities of the genes involved at the flowering time. In 



order to determine whether a dynamic model is optimal, in 
this study a statistical approach called Akaike Information 
Criterion (AIC) [40] was employed to validate the number 
of model parameters and determine the significance of the 
parameters. 

The Akaike Information Criterion (AIC), which attempts 
to include both the estimated residual variance and the 
model complexity in one statistic, decreases as the residual 
variance S e decreases, and it increases as the number 
of parameters p increases. For a gene regulatory model 
with p regulatory parameters to fit with a dataset of N 
samples, the Akaike Information Criterion (AIC) can be 
written as [40]: 



AIC = 2k-2ln(L) 



(10) 



where L is the likelihood of the mathematical model and 
p is the number of parameters in the model. 
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